perm filename APSEM[AP,DBL]1 blob sn#147277 filedate 1975-02-26 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00003 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00002 00002	.DEVICE XGP
C00006 00003	5↓_Evolution_↓*
C00023 ENDMK
C⊗;
.DEVICE XGP
.!XGPCOMMANDS←"/TMAR=100/PMAR=2000/BMAR=100"

.FONT 1 "BASL30"
.FONT 2 "BASB30"
.FONT 3 "NGR25"
.FONT 4  "BASI30"
.FONT 5  "NGR40"
.FONT 6  "FIX25"
.FONT 7  "NGR20"
.FONT 8  "GRFX35"
.TURN ON "↓_π{"
.TURN ON "⊗" FOR "%"
.PAGE FRAME 42 HIGH 89 WIDE
.AREA TEXT LINES 1 TO 41
.AREA FOOTING LINE 42
.!XGPLFTMAR←120
.SPACING 120 MILLS
.PREFACE 300 MILLS
.NOFILL
.PREFACE 85 MILLS
.FILL
.COUNT PAGE PRINTING "1"
.MACRO B ⊂ BEGIN VERBATIM GROUP ⊃
.MACRO BB ⊂ BEGIN NOFILL SELECT 3 INDENT 0 GROUP ⊃
.MACRO E ⊂ APART END ⊃
.MACRO D ⊂ ONCE PREFACE 100 MILLS ⊃
.MACRO BBB ⊂ BEGIN  INDENT 0  PREFACE 100 MILLS  SPACING 45 MILLS ⊃
.TABBREAK

.EVERY FOOTING(,⊗7{PAGE}⊗*,)
.TURN OFF "{}"
.PAGE←0
.NEXT PAGE
.GROUP SKIP 2
.ONCE CENTER
⊗5↓_NONFORMAL PROGRAM SYNTHESIS BY INTERACTING EXPERTS_↓⊗*

.GROUP SKIP 3
.ONCE CENTER
⊗5↓_Evolution_↓⊗*

⊗5PW1⊗*:  Accept a couple examples, plug elementary functions into a typical
recursive function schema until it produces the desired i/o results. The
method used here is almost pure enumeration.

Results: Unfortunately, it worked.
You ⊗4can⊗* get most of the simple recursive
LISP functions this unintelligent way (Reverse, Flatten, Fibonacci, Merge, Sort, etc.)

Conclusions: One must
turn to harder programs, probably in some domain, and include some expert 
knowledge from that domain.

Interim decision: What would be the domain of the synthesized programs? 
Answer: consider inductive inference programs.  Obvious difficulties
(size and sophistication of the programs) but four big advantages (...).

⊗5SEW⊗*: The simplest type of inductive inference task is sequence extrapolation.
This system was supposed to write various sequence extrapolators, carrying on a
fairly rigid dialogue with the user.  Internally, there was a huge schema for a
sequence extrapolator. At each uninstantiated slot, there was a little program
which indicated how to fill this in, how to extend it if it were already filled
in, how to check that it was filled in enough to be meaningful, what other slots
had to be filled in before this one could be completed, and so on. 
If the alternative contents of the slot were (at least partially known), then
they would all be stored there as well.
The human user
was continually presented with choices, based upon these constraints. 

Results: Unfortunately, it worked.
People are not very good at sequence extrapolation; they have a few different
kinds of operations they can perform, a few specific examples of each kind,
and can only compose them about three or four deep.  The entire domain's knowledge
can thus be expressed as the huge schema chosen, and every slot can have a set of
alternatives. That is, almost all the sequence extrapolater programs which are at
or elow human ability, can be written using a very rigid format and a very small
amount of intelligence. All the user does is make a couple dozen choices, and his
sequence extrapolater is turned out.

Conclusions: One must turn to still harder programs, ⊗4and/or⊗* insist that the system
synthesize its targets in such a way that not only is the final code correct, but
every step in the synthesis is reasonable.

Interim decision: Before turning to synthesizing big, sophisticated programs 
the "right" way, we'd better be sure we can even generate a few toy programs
that way.  Stop concentrating on the final results, and worry about how the internal
mechanisms act. 

⊗5PUP1⊗*:
Communication with the human user was primitive; he types in a description of the
current state and of the desired state of some part of the world, and the system
assumes the first description is true and works to produce the final description.
This system was written in QLISP, which utilizes pattern-directed function invocation.
The user must phrase his request as a recognizable pattern, one which matches some
of PUP1's functions' goal templates. For example, he might say that the current world
satisfies (CONTENTS A x) and (CONTENTS B y), and his goal might be
(SIMULTANEOUSLY  (CONTENTS A y) (CONTENTS B x)). 

The function AND-GOAL, the one which recognizes (SIMULTANEOUSLY  ←←anything),
always adopts the following strategy:
.NOFILL
	1. select one of the final conditions in $anything; call it α, and call the others β.
	2. acheive α.   
	3. if β is Null, then return success; else acheive (SIMULTANEOUSLY β).
			Failing here again causes (1.) to try for a new α and β.
	4. check that α is still true
.FILL
Notice that if you fail during a step, QLISP will reexecute (1.) and get a new α and
β. If there are no more ways to match (←α ←←β) to $anything, then (1.) itself fails
which means that AND-GOAL has failed. 

The function ASSIGN recognizes the goal (CONTENTS ←c ←v). It demands that there be
an assertion of the form (CONTENTS ←d $v), that is, some cell must already contain
the value desired. Then it simply returns (SETQ $c $d), and asserts that the
contents of the written-into cell are no longer what they once were. If no cell
is found which has the desired contents $v, then the program's history is searched
specifically for an assertion of the form "the contents of cell ←e were $v but were
replaced by the value ←z". This will always succede an assignment (SETQ $e $z).
What ASSIGN does in this case is to put a new assignment just before the
destructive one, saving the desired value in a temporary cell. It first sets up the
goal (NEWNAME F), meaning that the value of F will be a brand new name. When some
function satisfies this goal, ASSIGN inserts the saving assignment, succeded by the
comment "About to destroy the value $v of cell $e, $v is needed later so it is
saved in the new cell $F ".  It then asserts that the contents of the new cell are
$v, and recursively poses the original subgoal, (CONTENTS $c $v).

Let us quickly run through how these two functions, AND-GOAL and ASSIGN, solve the
problem originally put forward: reverse the contents of cells A and B.
AND-GOAL matches the main goal, puts forth the new subgoal (CONTENTS A y).
ASSIGN's template matches this, asks for a cell ←d satisfying (CONTENTS ←d y),
and the QLISP system finds such an assertion, binding d to B. ASSIGN then places
the following into the (currently NIL) synthesized program:
.NOFILL
  BEGIN PROGRAM
  "the contents of cell A were x but were replaced by the value y" 
  (SETQ A B)
  END PROGRAM
.FILL
ASSIGN returns this, and AND-GOAL now proposes the new subgoal (CONTENTS B x).
ASSIGN responds to this, but can't find any existing cell containing an x. It
searches the program, and sure enough finds the place where a cell containing
x was overwritten. It gets a new name, say TEMP1, and the program is now changed to:
.NOFILL
  BEGIN PROGRAM
  "about to destroy the value x of cell A,  x is needed later so it is saved in the new cell TEMP1"     
  (SETQ TEMP1 A)  
  "the contents of cell A were x but were replaced by the value y" 
  (SETQ A B)
  END PROGRAM
.FILL
The assertion (CONTENTS TEMP1 x) is made, and ASSIGN confidantly proposes its
original goal again, (CONTENTS B x); this is in case there is more than one function
who knows how to handle this type of goal. In  PUP1, the only recognizer is ASSIGN,
who asks for a cell with contents x, gets one (namely TEMP1), and then adds
(SETQ B TEMP1) to the tail of the program. It returns, and AND-GOAL checks that
(CONTENTS A y) is still satisfied. It is, so AND-GOAL return the final program.
The only executable steps are (SETQ TEMP1 A) (SETQ A B) (SETQ B TEMP1).

While this works, many issues have been smoothed over. One issue is why it creates
a brand new cell to store the needed value; why not use the cell it is eventually
going to store it in, or why not use some other existing cell which is not
currently known to be needed. Of course in the exchange case, both of these
strategies would fail; the former might lead into an infinite loop, in fact.

Results: The scant knowledge about LISP, lists, and assignment was encoded directly
and opaquely into the PUP1 functions. They were able to generate all the programs
originally expected (including the exchange, turning a specific list into a sorted
list, producing a desired list by taking existing ones and performing Car, Cdr,
Cons, Append, Rplaca, Rplacd).  The steps taken in each case seemed reasonable, but
what appeared to>be caution was ususally stupidity (e.g., getting a new cell in the
example just presented).  No programs outside this class could even be meaningfully
posed to PUP1.

Conclusions: The system isn't made smarter by 
disguising its stupidity to appear like caution.
Both this hangup and the limited range might go away if the domain knowledge were
made more accessable and plentiful.

Interim decision: Enlarge the tasks that PUP can do to include simple recursive
functions.  Synthesize these not by specialized, opaque functions, but rather by
utilizing a wealth of domain-specific knowledge, uniformly represented.








Decision to store knowledge by topics, and within a topic grouping by
specific usage. For example, one topic might be MAXIMIZE, and it would have a
property list with properties like Extreme, Efficiency, Algorithm, Domain,
Range, etc.  Under each property would be either a program or a declarative
assertion, which provided an answer to that aspect of the topic of Maximizing.
Part of one of these little programs might be a call to another part of the
same or a different topic.  
This system was written in QLISP, which utilizes pattern-directed function invocation.
nondeterministic; each subtopic was provided with a goal template, and when some
subtopic wanted something, it phrased that as a goal; the subtopics which responded
were those whose goal template matched  the current goal's pattern.  In a minute,
we'll work through an example of how PUP1 produced the SQRT program.
The user must phrase his request as a recognizable pattern.


Results: Many low-level programs were in fact generated; these include stacking
blocks, exchanging the values of two LISP atoms, simple list transformations, and
the integral-square-root program.  The two big defects of the system were: (i)
when a new program P is supposed to be generated, unless P is similar to one already
synthesized, the amount of modifications to PUP1 to get it to produce P are actually
bigger than P itself.  One might have to add two pages of LISP code to have it
write a new four-line function. (ii) The other big problem was the interrelations
in the knowledge. In order to insert some new information, I had to be "on top
of" the whole knowledge base; I had to know exactly what information existed, and
in what format. Although the QLISP pattern-directed function invocation theoretically
eliminates this difficulty, in practice there were just enough functions -- whose
templates were just sufficent -- to generate all the planned target programs (and
a few very similar ones).  


PUP6

MATH